Character Mapping and Ad-hoc Adaptation: Edinburgh's IWSLT 2020 Open Domain Translation System
This paper describes the University of Edinburgh’s neural machine translation systems submitted to the IWSLT 2020 open-domain Japanese-to-Chinese translation task. On top of commonplace techniques like tokenisation and corpus cleaning, we explore character mapping and unsupervised decoding-time adaptation. Our techniques focus on leveraging the provided data, and we show the positive impact of each technique through gradual improvements in BLEU.
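The character-mapping idea can be illustrated with a minimal preprocessing sketch: Japanese kanji variants are rewritten as their Chinese counterparts so that shared characters line up across the two scripts before tokenisation. The mapping table below is illustrative, not the one used in the paper.

```python
# Toy character-mapping table: Japanese (shinjitai) forms mapped to
# simplified Chinese forms. A real system would use a much larger,
# curated table; these three entries are only for illustration.
CHAR_MAP = {"図": "图", "気": "气", "広": "广"}

def map_characters(text, table=CHAR_MAP):
    """Replace each character that has a mapping; leave others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)

mapped = map_characters("図気")  # both characters have table entries
```

Applied before subword tokenisation, such a mapping increases the vocabulary overlap between the source and target sides of a Japanese/Chinese corpus.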
Bilingual Document Alignment with Latent Semantic Indexing
We apply cross-lingual Latent Semantic Indexing to the Bilingual Document
Alignment Task at WMT16. Reduced-rank singular value decomposition of a
bilingual term-document matrix derived from known English/French page pairs in
the training data allows us to map monolingual documents into a joint semantic
space. Two variants of cosine similarity between the vectors that place each
document into the joint semantic space are combined with a measure of string
similarity between corresponding URLs to produce 1:1 alignments of
English/French web pages in a variety of domains. The system achieves a recall
of ca. 88% if no in-domain data is used for building the latent semantic model,
and 93% if such data is included.
Analysing the system's errors on the training data, we argue that evaluating
aligner performance based on exact URL matches underestimates their true
performance and propose an alternative that is able to account for duplicates
and near-duplicates in the underlying data.
Comment: Proceedings of the First Conference on Machine Translation (2016), Volume 2: Shared Task Papers
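The core mechanism, reduced-rank SVD of a bilingual term-document matrix followed by folding monolingual documents into the joint space, can be sketched as follows. The data here is random and the rank, vocabulary size, and fold-in formula are standard LSI choices, not details taken from the paper.

```python
import numpy as np

# Toy bilingual term-document matrix: rows are terms from a shared
# English+French vocabulary, columns are known English/French page pairs.
rng = np.random.default_rng(0)
term_doc = rng.random((50, 20))  # 50 bilingual terms, 20 training pairs

# Reduced-rank singular value decomposition.
k = 5
U, S, Vt = np.linalg.svd(term_doc, full_matrices=False)
U_k, S_k = U[:, :k], S[:k]

def to_semantic_space(doc_vec):
    """Fold a monolingual document (term-count vector) into the joint
    k-dimensional semantic space (standard LSI fold-in: v = d^T U_k S_k^-1)."""
    return (doc_vec @ U_k) / S_k

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# An English page and a French page with near-identical content should land
# close together in the joint space, regardless of surface language.
en_doc = rng.random(50)
fr_doc = en_doc + 0.05 * rng.random(50)  # near-duplicate content
sim = cosine(to_semantic_space(en_doc), to_semantic_space(fr_doc))
```

In the full system, this semantic similarity would be combined with URL string similarity before extracting 1:1 alignments.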
A Deterministic Dependency Parser for Japanese
We present a rule-based, deterministic dependency parser for Japanese. It was implemented in C++, using object classes that reflect linguistic concepts and thus facilitate the transfer of linguistic intuitions into code. The parser first chunks morphemes into one-word phrases and then parses from right to left. The average parsing accuracy is 83.6%.
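The right-to-left deterministic strategy can be sketched as a simple loop: each phrase scans rightward for the first later phrase an attachment rule accepts, falling back to the sentence-final phrase. The rule and the toy sentence below are illustrative stand-ins, not the parser's actual linguistic rules.

```python
# Minimal sketch of deterministic right-to-left dependency parsing over
# one-word phrases. In Japanese, every phrase depends on some phrase to
# its right, so the final phrase is the root.
def parse_right_to_left(phrases, can_attach):
    n = len(phrases)
    heads = [None] * n  # heads[i] = index of the phrase that i depends on
    for i in range(n - 2, -1, -1):       # right to left; last phrase is root
        for j in range(i + 1, n):        # candidate heads to the right
            if can_attach(phrases[i], phrases[j]):
                heads[i] = j
                break
        if heads[i] is None:
            heads[i] = n - 1             # default: attach to the final phrase
    return heads

# Toy rule: noun phrases attach to the next verb; anything else attaches
# to the next phrase. (Real rules would be far richer.)
phrases = [("taroo-ga", "noun"), ("hon-o", "noun"), ("yonda", "verb")]
heads = parse_right_to_left(
    phrases, lambda dep, head: head[1] == "verb" if dep[1] == "noun" else True
)
```

Because each attachment decision is made once and never revised, the parse is deterministic and runs in a single right-to-left pass.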
Thunderstorm nowcasting with deep learning: a multi-hazard data fusion model
Predictions of thunderstorm-related hazards are needed in several sectors,
including first responders, infrastructure management and aviation. To address
this need, we present a deep learning model that can be adapted to different
hazard types. The model can utilize multiple data sources; we use data from
weather radar, lightning detection, satellite visible/infrared imagery,
numerical weather prediction and digital elevation models. It can be trained to
operate with any combination of these sources, such that predictions can still
be provided if one or more of the sources become unavailable. We demonstrate
the ability of the model to predict lightning, hail and heavy precipitation
probabilistically on a 1 km resolution grid, with a time resolution of 5 min
and lead times up to 60 min. Shapley values quantify the importance of the
different data sources, showing that the weather radar products are the most
important predictors for all three hazard types.
Comment: 15 pages, 3 figures. Submitted to Geophysical Research Letters
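One common way to make a fusion model tolerate missing sources, consistent with the behaviour described above, is to randomly drop whole data sources during training so the network never relies on any single one. The sketch below illustrates that idea; the source names, shapes, and dropout rate are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

# Illustrative multi-source input: channels x height x width per source.
rng = np.random.default_rng(42)
sources = {
    "radar":     rng.random((1, 64, 64)),
    "lightning": rng.random((1, 64, 64)),
    "satellite": rng.random((3, 64, 64)),
}

def fuse_with_dropout(sources, drop_prob=0.3, rng=rng):
    """Stack sources channel-wise, zeroing each whole source with
    probability drop_prob, but never dropping all of them at once."""
    keep = {name: rng.random() >= drop_prob for name in sources}
    if not any(keep.values()):               # guarantee at least one source
        keep[rng.choice(list(sources))] = True
    blocks = [arr if keep[name] else np.zeros_like(arr)
              for name, arr in sources.items()]
    return np.concatenate(blocks, axis=0), keep

x, kept = fuse_with_dropout(sources)  # x feeds the network's input layer
```

At inference time, an unavailable source is simply zeroed the same way, so predictions can still be produced from whichever sources remain.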
The SUMMA Platform: Scalable Understanding of Multilingual Media
We present the latest version of the SUMMA platform, an open-source software platform for monitoring and interpreting multilingual media, from written news published on the internet to live media broadcasts via satellite or internet streaming. This work was conducted within the scope of the Research and Innovation Action SUMMA, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688139.